reject inference
Balancing Performance and Reject Inclusion: A Novel Confident Inlier Extrapolation Framework for Credit Scoring
Ribeiro, Athyrson Machado, Raimundo, Marcos Medeiros
Reject Inference (RI) methods aim to address sample bias by inferring missing repayment data for rejected credit applicants. Traditional approaches often assume that the behavior of rejected clients can be extrapolated from accepted clients, despite potential distributional differences between the two populations. To mitigate this blind extrapolation, we propose a novel Confident Inlier Extrapolation framework (CI-EX). CI-EX iteratively identifies the distribution of rejected client samples using an outlier detection model and assigns labels to rejected individuals closest to the distribution of the accepted population based on probabilities derived from a supervised classification model. The effectiveness of our proposed framework is validated through experiments on two large real-world credit datasets. Performance is evaluated using the Area Under the Curve (AUC) as well as RI-specific metrics such as Kickout and a novel metric introduced in this work, denoted as Area under the Kickout. Our findings reveal that RI methods, including the proposed framework, generally involve a trade-off between AUC and RI-specific metrics. However, the proposed CI-EX framework consistently outperforms existing RI models from the credit literature in terms of RI-specific metrics while maintaining competitive performance in AUC across most experiments.
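The iterative loop described in the abstract above can be illustrated with a minimal sketch. It is not the authors' CI-EX implementation: distance to the centroid of the labeled pool stands in for the outlier detection model, a small hand-written logistic regression stands in for the supervised classifier, and the names `extrapolate`, `inlier_frac`, and `conf` are hypothetical parameters chosen only for illustration.

```python
import math
import random

def predict_proba(w, b, x):
    """Probability of default (label 1) under a logistic model."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, epochs=200, lr=0.5):
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def centroid_distance(X_ref):
    """Assumed stand-in for an outlier model: distance to the centroid of X_ref."""
    dim = len(X_ref[0])
    c = [sum(row[j] for row in X_ref) / len(X_ref) for j in range(dim)]
    return lambda x: math.sqrt(sum((xj - cj) ** 2 for xj, cj in zip(x, c)))

def extrapolate(X_acc, y_acc, X_rej, rounds=3, inlier_frac=0.3, conf=0.8):
    """Label only rejects that are both close to the labeled pool and confidently scored."""
    X_lab, y_lab, pool = list(X_acc), list(y_acc), list(X_rej)
    for _ in range(rounds):
        if not pool:
            break
        w, b = fit_logistic(X_lab, y_lab)
        pool.sort(key=centroid_distance(X_lab))  # nearest rejects first
        k = max(1, int(inlier_frac * len(pool)))
        inliers, outliers = pool[:k], pool[k:]
        undecided = []
        for x in inliers:
            p = predict_proba(w, b, x)
            if p >= conf:
                X_lab.append(x)
                y_lab.append(1)
            elif p <= 1.0 - conf:
                X_lab.append(x)
                y_lab.append(0)
            else:
                undecided.append(x)  # not confident enough; stays unlabeled
        pool = undecided + outliers
    return X_lab, y_lab, pool

# Synthetic data, purely illustrative: rejects are shifted away from accepts.
random.seed(0)
X_acc = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(60)]
y_acc = [1 if x[0] + x[1] > 1.0 else 0 for x in X_acc]
X_rej = [[random.gauss(1.5, 1), random.gauss(1.5, 1)] for _ in range(40)]
X_lab, y_lab, unlabeled = extrapolate(X_acc, y_acc, X_rej)
print(len(X_lab) - len(X_acc), "rejects labeled;", len(unlabeled), "left unlabeled")
```

Rejects that are far from the labeled pool, or whose predicted probability is not extreme, are left unlabeled rather than extrapolated blindly, which is the trade-off the abstract describes between including rejects and preserving performance.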
- South America > Brazil > São Paulo > Campinas (0.04)
- North America > United States (0.04)
- Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)
- Banking & Finance > Loans (1.00)
- Banking & Finance > Credit (1.00)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Best Practices for Responsible Machine Learning in Credit Scoring
Valdrighi, Giovani, Ribeiro, Athyrson M., Pereira, Jansen S. B., Guardieiro, Vitoria, Hendricks, Arthur, Filho, Décio Miranda, Garcia, Juan David Nieto, Bocca, Felipe F., Veronese, Thalita B., Wanner, Lucas, Raimundo, Marcos Medeiros
For individuals and families, access to affordable credit is essential as protection against financial volatility, financing education, pursuing business opportunities, and building equity. From the lender's perspective, there is a delicate balance between improving access to credit and higher costs due to defaults on payments. Creating responsible credit concession models requires maintaining this balance [Kozodoi et al., 2022] while ensuring fair outcomes across different groups of individuals, improving access, and helping applicants understand factors that influence rejection so that they can take action to improve their credit potential. Credit concession models are created using a variety of data, such as employment history (for example, occupation and income), demographic data (such as age, marital status, and education), and financial data (for example, checking account balance, credit card usage, and bill payment history). Given these features, models such as logistic regression, gradient boosting, and decision trees can be trained to predict whether a new customer will default on a loan over a period of time [Louzada et al., 2016].
- South America > Brazil > São Paulo > Campinas (0.04)
- Asia > Taiwan (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Banking & Finance > Loans (1.00)
- Banking & Finance > Credit (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.92)
- Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.90)
Towards a Better Microcredit Decision
Song, Mengnan, Wang, Jiasong, Su, Suisui
Reject inference comprises techniques to infer the possible repayment behavior of rejected cases. In this paper, we model credit from a new perspective by capturing the sequential pattern of interactions among multiple stages of the loan business to make better use of the underlying causal relationship. Specifically, we first define three stages with sequential dependence throughout the loan process, namely credit granting (AR), withdrawal application (WS), and repayment commitment (GB), and integrate them into a multi-task architecture. Within each stage, an intra-stage multi-task classification is built to meet different business goals. Then we design an Information Corridor to express sequential dependence, leveraging the interaction information between customer and platform from earlier stages via a hierarchical attention module that controls the content and size of the information channel. In addition, a semi-supervised loss is introduced to deal with the unobserved instances. The proposed multi-stage interaction sequence (MSIS) method is simple yet effective, and experimental results on a real data set from a top loan platform in China show that it remedies the population bias and improves model generalization.
- Asia > China > Beijing > Beijing (0.04)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Banking & Finance > Credit (0.71)
- Banking & Finance > Loans (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Data Science (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Shallow Self-Learning for Reject Inference in Credit Scoring
Kozodoi, Nikita, Katsas, Panagiotis, Lessmann, Stefan, Moreira-Matias, Luis, Papakonstantinou, Konstantinos
Credit scoring models support loan approval decisions in the financial services industry. Lenders train these models on data from previously granted credit applications, where the borrowers' repayment behavior has been observed. This approach creates sample bias. The scoring model (i.e., classifier) is trained on accepted cases only. Applying the resulting model to screen credit applications from the population of all borrowers degrades model performance. Reject inference comprises techniques to overcome sampling bias through assigning labels to rejected cases. The paper makes two contributions. First, we propose a self-learning framework for reject inference. The framework is geared toward real-world credit scoring requirements through considering distinct training regimes for iterative labeling and model training. Second, we introduce a new measure to assess the effectiveness of reject inference strategies. Our measure leverages domain knowledge to avoid artificial labeling of rejected cases during strategy evaluation. We demonstrate this approach to offer a robust and operational assessment of reject inference strategies. Experiments on a real-world credit scoring data set confirm the superiority of the adjusted self-learning framework over regular self-learning and previous reject inference strategies. We also find strong evidence in favor of the proposed evaluation measure assessing reject inference strategies more reliably, raising the performance of the eventual credit scoring model.
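The abstract above does not state how the proposed measure is computed. As a rough illustration only, the sketch below assumes the kickout formulation commonly attributed to this line of work: bad cases "kicked out" of the accepted set after reject inference count in the model's favor, good cases kicked out count against it, and both are normalized by the bad rate. The function name, argument names, and the formula itself are assumptions to be checked against the paper.

```python
def kickout(base_selected, ri_selected, is_bad, bad_rate):
    """Assumed kickout formulation (not confirmed by the abstract).

    base_selected: set of case ids accepted by the model trained without reject inference
    ri_selected:   set of case ids accepted by the model trained with reject inference
    is_bad:        dict mapping case id -> True if the case defaulted (labels known
                   for every case in base_selected)
    bad_rate:      share of bad cases in the data, strictly between 0 and 1
    """
    if not 0.0 < bad_rate < 1.0:
        raise ValueError("bad_rate must lie strictly between 0 and 1")
    kicked = base_selected - ri_selected           # cases dropped after reject inference
    k_bad = sum(1 for i in kicked if is_bad[i])    # bad cases kicked out
    k_good = len(kicked) - k_bad                   # good cases kicked out
    s_bad = sum(1 for i in base_selected if is_bad[i])
    if s_bad == 0:
        raise ValueError("no bad cases in the base selection; measure undefined")
    return (k_bad / bad_rate - k_good / (1.0 - bad_rate)) / (s_bad / bad_rate)

# Toy example: the base model accepts cases 1-4; after reject inference,
# case 1 (a bad case) is replaced by case 5.
labels = {1: True, 2: True, 3: False, 4: False}
score = kickout({1, 2, 3, 4}, {2, 3, 4, 5}, labels, bad_rate=0.5)
print(score)
```

Because the measure only needs observed labels for cases the base model accepted, it avoids assigning artificial labels to rejected cases during evaluation, which is the property the abstract emphasizes.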
- South America > Uruguay > Maldonado > Maldonado (0.05)
- Europe > Germany > Hamburg (0.04)
- Europe > Germany > Berlin (0.04)
- North America > United States (0.04)
Deep Generative Models for Reject Inference in Credit Scoring
Mancisidor, Rogelio A., Kampffmeyer, Michael, Aas, Kjersti, Jenssen, Robert
Credit scoring models based on accepted applications may be biased, and their consequences can have a statistical and economic impact. Reject inference is the process of attempting to infer the creditworthiness status of the rejected applications. In this research, we use deep generative models to develop two new semi-supervised Bayesian models for reject inference in credit scoring, in which we model the data generating process to be dependent on a Gaussian mixture. The goal is to improve the classification accuracy in credit scoring models by adding rejected applications. Our proposed models infer the unknown creditworthiness of the rejected applications by exact enumeration of the two possible outcomes of the loan (default or non-default). The efficient stochastic gradient optimization technique used in deep generative models makes our models suitable for large data sets. Finally, the experiments in this research show that our proposed models perform better than classical and alternative machine learning models for reject inference in credit scoring.
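"Exact enumeration of the two possible outcomes" can be illustrated with the generic semi-supervised variational bound for an unlabeled example: the per-outcome bounds are averaged under the classifier's posterior and the posterior's entropy is added. The sketch below assumes that generic form rather than the paper's specific Gaussian-mixture models; `unlabeled_bound` and its arguments are hypothetical names, and the per-outcome bounds are taken as given inputs.

```python
import math

def unlabeled_bound(q_default, bound_default, bound_nondefault):
    """Lower bound on log p(x) for a rejected (unlabeled) application.

    q_default:        classifier probability q(y=1 | x) that the loan defaults
    bound_default:    lower bound on log p(x, y=1), supplied by the generative model
    bound_nondefault: lower bound on log p(x, y=0), supplied by the generative model
    """
    q = {1: q_default, 0: 1.0 - q_default}
    bounds = {1: bound_default, 0: bound_nondefault}
    # Enumerate both outcomes exactly instead of sampling the missing label.
    expected = sum(q[y] * bounds[y] for y in (0, 1))
    # Entropy of q(y | x); terms with zero probability contribute nothing.
    entropy = -sum(q[y] * math.log(q[y]) for y in (0, 1) if q[y] > 0.0)
    return expected + entropy

print(unlabeled_bound(0.5, -2.0, -4.0))
```

With only two outcomes the sum is exact and cheap, so no sampling over the missing label is needed when rejected applications are added to training.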
- South America > Uruguay > Maldonado > Maldonado (0.04)
- Africa > Kenya > Lamu County > Lamu (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- Banking & Finance > Loans (1.00)
- Banking & Finance > Credit (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.81)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)